CLARIFY: Human-Powered Training of SMT Models

Authors

  • Darren Scott Appling
  • Ellen Yi-Luen Do
Abstract

We present CLARIFY, an augmented environment that aims to improve the quality of translations generated by phrase-based statistical machine translation systems by learning from humans. CLARIFY employs four types of knowledge input: 1) direct input, 2) results from a word alignment game, 3) results from a phrase alignment game, and 4) results from a paraphrasing game. Each of these inputs elicits user knowledge to improve future system translations.

Similar resources

Feasibility Study of Building a Human Powered Hydrofoil Vessel

In this paper, a feasibility study of building a Human Powered Hydrofoil (HPH) vessel is reported. Hydrofoil vessels are a well-known class of high-speed craft. In addition to high-speed operation, hydrofoils offer reliable maneuvering capability, good stability, and proper operation in waves. Human-powered vehicles are also an advancing idea nowadays. Different aspects of the design and...

Applying Morphology Generation Models to Machine Translation

We improve the quality of statistical machine translation (SMT) by applying models that predict word forms from their stems using extensive morphological and syntactic information from both the source and target languages. Our inflection generation models are trained independently of the SMT system. We investigate different ways of combining the inflection prediction component with the SMT syst...

Feature Decay Algorithms for Fast Deployment of Accurate Statistical Machine Translation Systems

We use feature decay algorithms (FDA) for fast deployment of accurate statistical machine translation systems taking only about half a day for each translation direction. We develop parallel FDA for solving computational scalability problems caused by the abundance of training data for SMT models and LM models and still achieve SMT performance that is on par with using all of the training data ...

PanDoRA: A Large-scale Two-way Statistical Machine Translation System for Hand-held Devices

The statistical machine translation (SMT) approach has taken a leading place in the field of Machine Translation for its better translation quality and lower training cost compared to other approaches. However, due to its high demand for computing resources, an SMT system cannot be run directly on hand-held devices. Most existing hand-held translation systems are either interlingua-based, which...

Stacking for Statistical Machine Translation

We propose applying stacking, an ensemble learning technique, to statistical machine translation (SMT) models. A diverse ensemble of weak learners is created using the same SMT engine (a hierarchical phrase-based system) by manipulating the training data, and a strong model is created by combining the weak models on-the-fly. Experimental results on two language pairs and three different si...

Publication date: 2009